AITopics | fraction 0

Reward Transfer from Inverse Reinforcement Learning: A Coupled Minimax Approach

Hao, Guang-Yuan, van der Laan, Lars, Bibaut, Aurélien, Kallus, Nathan

arXiv.org Machine Learning, May-28-2026

Expert demonstrations, such as those from car drivers, help navigate environments with unknown rewards, but are often collected in controlled settings, such as closed-course test tracks, while learned control policies must be deployed in new environments, such as city streets. We can imitate experts to perform well in the same source environment where demonstrations are observed, and we may even use inverse reinforcement learning (IRL) to improve on simple behavior cloning (Ng and Russell, 2000; Abbeel and Ng, 2004; Ziebart et al., 2008; Fu et al., 2018; Geng et al., 2020). But the target environment may have a different transition law, discount factor, or soft-control regularization. For this, IRL is crucial: we can learn a reward from demonstrations in the source environment and transfer it to the target environment, learning a policy that optimizes the same reward function in a new setting (Fu et al., 2018; Schlaginhaufen and Kamgarpour, 2024). In this paper, we characterize how well this transfer can be done and which approaches are preferable. In particular, we show the value in a coupled approach that takes the target environment into account even when learning from the source. In ordinary offline control, the Bellman equation uses a known reward, so the main statistical error comes from target transitions.
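The transfer step described above, optimizing a reward learned in the source environment under the target environment's transition law, discount factor, and soft-control regularization, can be sketched with tabular soft value iteration. This is only the plain plug-in step, not the coupled minimax estimator the paper proposes. The two-state environment, the reward table `learned_reward`, the transition table `target_transitions`, and every numeric value are hypothetical.

```python
import math


def soft_value_iteration(reward, transitions, gamma, temperature=1.0, iters=500):
    """Entropy-regularized value iteration on a small tabular MDP.

    reward[s][a]          : reward table (assumed already recovered by IRL)
    transitions[s][a][s2] : target-environment transition probabilities
    Returns (V, policy), where policy[s][a] is a softmax over soft Q-values.
    """
    n_states = len(reward)
    n_actions = len(reward[0])
    V = [0.0] * n_states
    for _ in range(iters):
        # Soft Bellman backup using the target transitions and discount.
        Q = [
            [
                reward[s][a]
                + gamma * sum(p * V[s2] for s2, p in enumerate(transitions[s][a]))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        # Soft maximum: temperature * log-sum-exp(Q / temperature).
        V = [
            temperature * math.log(sum(math.exp(q / temperature) for q in Q[s]))
            for s in range(n_states)
        ]
    policy = [
        [math.exp((Q[s][a] - V[s]) / temperature) for a in range(n_actions)]
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical reward, assumed to come from IRL on source demonstrations:
# state 1 is rewarding, state 0 is not.
learned_reward = [[0.0, 0.0], [1.0, 1.0]]

# Hypothetical target dynamics: action 0 stays put, action 1 switches
# state with probability 0.8.
target_transitions = [
    [[1.0, 0.0], [0.2, 0.8]],
    [[0.0, 1.0], [0.8, 0.2]],
]

values, policy = soft_value_iteration(learned_reward, target_transitions, gamma=0.9)
```

Changing `gamma`, `temperature`, or `target_transitions` while keeping `learned_reward` fixed mirrors the setting in the abstract, where the reward is shared but the environment differs.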

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2605.27834

Genre: Research Report (0.63)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

7fd3b80fb1884e2927df46a7139bb8bf-Supplemental.pdf

Neural Information Processing Systems, Feb-9-2026, 03:33:11 GMT

The IDs of the 10 datasets used in this work, as well as the number of examples and features, are provided in Table 1 in the main manuscript. All of the datasets correspond to binary classification problems, with varying degrees of class imbalance. While the prediction is always performed in the logarithmic domain, when evaluating the models we transform both the labels and the model predictions back into their original domain. The loss function used for training and evaluation is the standard root mean-squared error (sklearn.metrics.mean_squared_error). We download the raw data programmatically using the Kaggle API, which produces the file train.tsv.
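The evaluation described above, predicting in the logarithmic domain and scoring in the original domain, can be sketched as follows. The excerpt does not say which logarithm is used, so this sketch assumes the natural logarithm with `exp` as its inverse; the function name and the sample values are hypothetical. Note that `sklearn.metrics.mean_squared_error` returns the mean squared error by default, so a square root is still needed to obtain the root mean-squared error; the sketch uses only the standard library to stay self-contained.

```python
import math


def rmse_original_domain(log_labels, log_preds):
    """RMSE computed after mapping log-domain values back with exp.

    Assumes a natural-log transform; swap in the matching inverse if the
    data used a different one (for example log1p / expm1).
    """
    labels = [math.exp(y) for y in log_labels]
    preds = [math.exp(p) for p in log_preds]
    mse = sum((y - p) ** 2 for y, p in zip(labels, preds)) / len(labels)
    return math.sqrt(mse)


# Hypothetical labels 10 and 100 with predictions 12 and 90, all stored as logs.
log_labels = [math.log(10.0), math.log(100.0)]
log_preds = [math.log(12.0), math.log(90.0)]
score = rmse_original_domain(log_labels, log_preds)
```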

artificial intelligence, machine learning, subsample 0, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula

Mishra, Shubhra, Poesia, Gabriel, Mo, Belinda, Goodman, Noah D.

arXiv.org Artificial Intelligence, Jun-30-2024

Mathematical problem solving is an important skill for Large Language Models (LLMs), both as an important capability and a proxy for a range of reasoning abilities. Existing benchmarks probe a diverse set of skills, but they yield aggregate accuracy metrics, obscuring specific abilities or weaknesses. Furthermore, they are difficult to extend with new problems, risking data contamination over time. To address these challenges, we propose MathCAMPS: a method to synthesize high-quality mathematical problems at scale, grounded on 44 fine-grained "standards" from the Mathematics Common Core (CC) Standard for K-8 grades. We encode each standard in a formal grammar, allowing us to sample diverse symbolic problems and their answers. We then use LLMs to realize the symbolic problems into word problems. We propose a cycle-consistency method for validating problem faithfulness. Finally, we derive follow-up questions from symbolic structures and convert them into follow-up word problems - a novel task of mathematical dialogue that probes for robustness in understanding. Experiments on 23 LLMs show surprising failures even in the strongest models (in particular when asked simple follow-up questions). Moreover, we evaluate training checkpoints of Pythia 12B on MathCAMPS, allowing us to analyze when particular mathematical skills develop during its training. Our framework enables the community to reproduce and extend our pipeline for a fraction of the typical cost of building new high-quality datasets.
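The idea of encoding a standard as a grammar and sampling symbolic problems together with their answers can be illustrated with a toy example. The grammar below (integer arithmetic with `+`, `-`, `*`) and the names `sample_expr`, `render`, and `solve` are hypothetical and far simpler than the 44 Common Core standards the abstract mentions; the LLM step that turns symbolic problems into word problems and the cycle-consistency check are not shown.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def sample_expr(rng, depth=2):
    """Sample from a toy grammar: expr -> int | (op, expr, expr)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(1, 20)
    op = rng.choice(sorted(OPS))
    return (op, sample_expr(rng, depth - 1), sample_expr(rng, depth - 1))


def render(expr):
    """Write the symbolic problem as a fully parenthesized string."""
    if isinstance(expr, int):
        return str(expr)
    op, left, right = expr
    return f"({render(left)} {op} {render(right)})"


def solve(expr):
    """Compute the ground-truth answer directly from the symbolic structure."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    return OPS[op](solve(left), solve(right))


rng = random.Random(0)  # fixed seed so the sampled problems are reproducible
problems = [sample_expr(rng) for _ in range(5)]
pairs = [(render(p), solve(p)) for p in problems]
```

Because the answer is derived from the symbolic tree rather than from generated text, each sampled problem comes with a known-correct label, and new problems can be drawn at will by changing the seed.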

accuracy, mathcamps, word problem, (15 more...)

2407.009

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > K-12 Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Accelerating gradient-based topology optimization design with dual-model neural networks

Qian, Chao, Ye, Wenjing

arXiv.org Artificial Intelligence, Sep-14-2020

Topology optimization (TO) is a common technique used in free-form designs. However, conventional TO-based design approaches suffer from high computational cost due to the need for repetitive forward calculations and/or sensitivity analysis, which are typically done using high-dimensional simulations such as Finite Element Analysis (FEA). In this work, neural networks are used as efficient surrogate models for forward and sensitivity calculations in order to greatly accelerate the design process of topology optimization. To improve the accuracy of sensitivity analyses, dual-model neural networks that are trained with both forward and sensitivity data are constructed and integrated into the Solid Isotropic Material with Penalization (SIMP) method to replace FEA. The performance of the accelerated SIMP method is demonstrated on two benchmark design problems, namely minimum compliance design and metamaterial design. For a problem of size 64x64, the speedup is 137 times in forward calculation and 74 times in sensitivity analysis. In addition, effective data generation methods suitable for TO designs are investigated and developed, which lead to a large saving in training time. In both benchmark design problems, a design accuracy of 95% can be achieved with only around 2000 training samples.
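For readers unfamiliar with SIMP, the forward and sensitivity quantities mentioned above both stem from a penalized material interpolation. The sketch below uses the commonly cited modified SIMP form, E(rho) = E_void + rho^p (E_solid - E_void), and its derivative with respect to the density rho. The abstract does not state which variant or penalty the authors use, so the form, the penalty value 3, and the function names are assumptions.

```python
def simp_modulus(density, e_solid=1.0, e_void=1e-9, penalty=3.0):
    """Interpolated Young's modulus for an element density in [0, 1]."""
    return e_void + density ** penalty * (e_solid - e_void)


def simp_modulus_sensitivity(density, e_solid=1.0, e_void=1e-9, penalty=3.0):
    """Derivative of the interpolated modulus with respect to the density."""
    return penalty * density ** (penalty - 1.0) * (e_solid - e_void)


# Check the analytic sensitivity against a central finite difference.
density = 0.5
step = 1e-6
finite_difference = (
    simp_modulus(density + step) - simp_modulus(density - step)
) / (2.0 * step)
analytic = simp_modulus_sensitivity(density)
```

In a full SIMP loop these per-element derivatives are combined with a finite element solve to obtain the objective's sensitivities; that solve is the expensive step the abstract proposes to replace with neural network surrogates.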

deep learning, poisson, upstream oil & gas, (17 more...)

2009.06245

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Learning Latent Dynamics for Planning from Pixels

Hafner, Danijar, Lillicrap, Timothy, Fischer, Ian, Villegas, Ruben, Ha, David, Lee, Honglak, Davidson, James

arXiv.org Artificial Intelligence, Nov-11-2018

Planning has been very successful for control tasks with known environment dynamics. To leverage planning in unknown environments, the agent needs to learn the dynamics from interactions with the world. However, learning dynamics models that are accurate enough for planning has been a long-standing challenge, especially in image-based domains. We propose the Deep Planning Network (PlaNet), a purely model-based agent that learns the environment dynamics from pixels and chooses actions through online planning in latent space. To achieve high performance, the dynamics model must accurately predict the rewards ahead for multiple time steps. We approach this problem using a latent dynamics model with both deterministic and stochastic transition functions and a generalized variational inference objective that we name latent overshooting. Using only pixel observations, our agent solves continuous control tasks with contact dynamics, partial observability, and sparse rewards. PlaNet uses significantly fewer episodes and reaches final performance close to and sometimes higher than top model-free algorithms.
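Online planning with a learned model, as described above, amounts to searching over action sequences and scoring each one by the rewards the model predicts. The sketch below assumes the cross-entropy method (CEM) as the search procedure, which the abstract itself does not name, and replaces the learned latent dynamics with a hypothetical one-dimensional toy model (`predicted_return`) whose reward is the negative distance to the origin. All names and hyperparameters are illustrative.

```python
import random
import statistics


def clip(action):
    """Keep actions inside the assumed valid range [-1, 1]."""
    return max(-1.0, min(1.0, action))


def predicted_return(state, actions):
    """Toy stand-in for a learned model: move along a line, reward = -|state|."""
    total = 0.0
    for action in actions:
        state = state + clip(action)
        total -= abs(state)
    return total


def cem_plan(state, rng, horizon=5, candidates=100, top_k=10, iters=5):
    """Return the first action of the action sequence found by CEM."""
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian belief.
        samples = [
            [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(candidates)
        ]
        # Keep the sequences with the highest predicted return.
        samples.sort(key=lambda acts: predicted_return(state, acts), reverse=True)
        elite = samples[:top_k]
        # Refit the belief to the elite set, one time step at a time.
        mean = [statistics.fmean(column) for column in zip(*elite)]
        std = [max(statistics.pstdev(column), 1e-3) for column in zip(*elite)]
    return clip(mean[0])


# Starting to the right of the origin, the planner should move left.
first_action = cem_plan(3.0, random.Random(0))
```

Only the first action is executed before replanning from the next state, which is what makes the planning "online"; in an agent like the one described, `predicted_return` would roll out the learned latent model rather than a hand-written rule.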

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

1811.04551

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (0.68)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Using deep learning for comprehensive, personalized forecasting of Alzheimer's Disease progression

Fisher, Charles K., Smith, Aaron M., Walsh, Jonathan R., the Coalition Against Major Diseases

arXiv.org Machine Learning, Jul-10-2018

A patient is more than one number, yet most approaches to machine learning from electronic health data can only predict a single endpoint. Here, we present an alternative -- using unsupervised deep learning to simulate detailed patient trajectories. We use data comprising 18-month longitudinal trajectories of 42 clinical variables from 1908 patients with Mild Cognitive Impairment (MCI) or Alzheimer's Disease (AD) to train a model for personalized forecasting of disease progression. Our model simulates the evolution of each sub-component of cognitive exams, laboratory tests, and their associations with baseline clinical characteristics, generating both predictions and their confidence intervals. Even though it is not trained to predict changes in disease severity, our unsupervised model predicts changes in total ADAS-Cog scores with the same accuracy as specifically trained supervised models. We show how simulations can be used to interpret our model and demonstrate how to create synthetic control arm data for AD clinical trials. Our model's ability to simultaneously predict dozens of characteristics of a patient at any point in the future is a crucial step forward in computational precision medicine.

artificial intelligence, fraction 0, machine learning, (19 more...)

1807.03876

Country:

Asia (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
South America (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Conditional molecular design with deep generative models

Kang, Seokho, Cho, Kyunghyun

arXiv.org Machine Learning, Apr-30-2018

Although machine learning has been successfully used to propose novel molecules that satisfy desired properties, it is still challenging to explore a large chemical space efficiently. In this paper, we present a conditional molecular design method that facilitates generating new molecules with desired properties. The proposed model, which simultaneously performs both property prediction and molecule generation, is built as a semi-supervised variational autoencoder trained on a set of existing molecules with only a partial annotation. We generate new molecules with desired properties by sampling from the generative distribution estimated by the model. We demonstrate the effectiveness of the proposed model by evaluating it on drug-like molecules. The model improves the performance of property prediction by exploiting unlabeled molecules, and efficiently generates novel molecules fulfilling various target conditions.

artificial intelligence, machine learning, natural language, (19 more...)

1805.00108

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)
